Large-scale Polish SLU

Authors

  • Patrick Lehnen
  • Stefan Hahn
  • Hermann Ney
  • Agnieszka Mykowiecka
Abstract

In this paper, we present state-of-the-art concept tagging results on a new corpus for Polish spoken language understanding (SLU). It is the first large-scale corpus for this language (~200 different concepts) that has been semantically annotated, and it will be made publicly available. Conditional Random Fields have proven to give the best results for string-to-string translation problems. Using this approach, we achieve a concept error rate (CER) of 22.6% on an evaluation corpus. To additionally extract attribute values, we combine a statistical and a rule-based approach, which leads to a CER of 30.2%.
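The abstract does not say how the concept error rate is computed. A common convention, assumed here and not confirmed by the source, treats it like word error rate: the minimum number of substitutions, deletions, and insertions needed to turn the reference concept sequence into the hypothesis, divided by the number of reference concepts. The sketch below illustrates that convention; the function name and the concept labels are hypothetical.

```python
def concept_error_rate(reference, hypothesis):
    """Edit-distance-based error rate between two concept sequences.

    Assumption: CER = (substitutions + deletions + insertions) / len(reference),
    analogous to word error rate. The source abstract does not spell this out.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference sequence must not be empty")

    # dist[i][j] = minimum edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # delete all i reference concepts
    for j in range(m + 1):
        dist[0][j] = j  # insert all j hypothesis concepts

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[n][m] / n


# Hypothetical concept labels, for illustration only.
reference = ["action", "bus_number", "stop_name", "time"]
hypothesis = ["action", "stop_name", "time", "time"]
print(concept_error_rate(reference, hypothesis))
```

Under this convention the rate can exceed 100% when the hypothesis contains many spurious concepts, because insertions count as errors while the denominator stays fixed at the reference length.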

Similar resources

Design and Data Collection for Spoken Polish Dialogs Database

Spoken corpora provide a critical resource for research, development and evaluation of spoken dialog systems. This paper describes the telephone spoken dialog corpus for Polish created by the Polish-Japanese Institute of Information Technology team within the LUNA project (IST 033549). The main goal of this project is to create a robust natural spoken language understanding (SLU) toolkit, which can...

Multi-Task Learning for parsing the Alexa Meaning Representation Language

The Alexa Meaning Representation Language (AMRL) is a compositional graph-based semantic representation that includes fine-grained types, properties, actions, and roles and can represent a wide variety of spoken language. AMRL increases the ability of virtual assistants to represent more complex requests, including logical and conditional statements as well as ones with nested clauses. Due to t...

Efficient learning for spoken language understanding tasks with word embedding based pre-training

Spoken language understanding (SLU) tasks such as goal estimation and intention identification from user’s commands are essential components in spoken dialog systems. In recent years, neural network approaches have shown great success in various SLU tasks. However, one major difficulty of SLU is that the annotation of collected data can be expensive. Often this results in insufficient data bein...

Cooperative Spoken Language Understanding for Robust Speech Translation

This paper argues that the time is now right to field practical and robust Spoken Language Understanding (SLU) systems. It argues that, at the present state of the art, robustness can best be achieved through user cooperation and compromise with the system. If this insight guides design, several sorts of reliable SLU systems can be deployed over the next few years. Further, SLU systems can be a...

Recurrent Polynomial Network for Dialogue State Tracking with Mismatched Semantic Parsers

Recently, constrained Markov Bayesian polynomial (CMBP) has been proposed as a data-driven rule-based model for dialog state tracking (DST). CMBP is an approach to bridge rule-based models and statistical models. Recurrent Polynomial Network (RPN) is a recent statistical framework taking advantage of rule-based models and can achieve state-of-the-art performance on the data corpora of DSTC-3, ou...

Publication date: 2009